Skip to content

Conversation

@Robdel12
Copy link
Contributor

@Robdel12 Robdel12 commented Jan 5, 2026

Summary

  • Visual diffs no longer break tests by default - SDKs can now choose whether to fail on diffs
  • Server returns 200 for diffs with status: 'diff' instead of 422 error
  • Added --fail-on-diff CLI flag for vizzly tdd start and vizzly tdd run

Changes

Server:

  • Visual diffs return 200 with status: 'diff' instead of 422
  • Consistent response format with status field: 'match', 'diff', 'new', or 'baseline-updated'

CLI:

  • --fail-on-diff flag on vizzly tdd start and vizzly tdd run
  • failOnDiff stored in server.json for SDK discovery

Ember SDK:

  • Read failOnDiff from server.json, env var, or per-snapshot option
  • Gracefully skip snapshots when no server running (like Percy)
  • failOnDiff option on vizzlySnapshot() for per-snapshot control

Usage

# Default: diffs logged as warnings, tests pass
vizzly tdd start

# Fail tests on visual diffs
vizzly tdd start --fail-on-diff

# Or via environment variable
VIZZLY_FAIL_ON_DIFF=true npm test

# Or per-snapshot
await vizzlySnapshot('critical-ui', { failOnDiff: true });

Test plan

  • Updated tdd-handler.test.js for new 200 response format
  • Added tests for diff response structure
  • Added tests for getServerInfo() reading failOnDiff
  • All CLI tests pass
  • All Ember SDK tests pass

Visual diffs no longer break tests by default - SDKs can now choose whether
to fail on diffs via CLI flag, env var, or per-snapshot option.

**Server changes:**
- Visual diffs return 200 with `status: 'diff'` instead of 422
- Consistent response format: `status` field is 'match', 'diff', 'new', or 'baseline-updated'
- SDKs can decide whether to fail tests based on the status

**CLI changes:**
- Add `--fail-on-diff` flag to `vizzly tdd start` and `vizzly tdd run`
- Store `failOnDiff` in server.json for SDK discovery

**Ember SDK changes:**
- Read `failOnDiff` from server.json, env var (VIZZLY_FAIL_ON_DIFF), or per-snapshot option
- Gracefully skip snapshots when no server is running (like Percy)
- Add `failOnDiff` option to `vizzlySnapshot()` for per-snapshot control

**Priority order for failOnDiff:**
1. Per-snapshot option: `vizzlySnapshot('name', { failOnDiff: true })`
2. Environment variable: `VIZZLY_FAIL_ON_DIFF=true`
3. CLI flag: `vizzly tdd start --fail-on-diff`
4. Default: false (diffs logged as warnings, tests pass)
@claude
Copy link

claude bot commented Jan 5, 2026

Code Review: Add --fail-on-diff option and improve SDK resilience

Overall, this is a well-designed and well-tested PR that successfully decouples visual diff detection from test failure. The implementation is clean, consistent, and follows good engineering practices.


✅ Strengths

1. Excellent API Design

  • The priority hierarchy (per-snapshot > env var > server config > default) is intuitive and flexible
  • Returning 200 with status: 'diff' instead of 422 is the right choice - it's not an error, it's data
  • Consistent response format across all status types (match, diff, new, baseline-updated)

2. Strong Test Coverage

  • Comprehensive unit tests for server response formats
  • Tests for config discovery and priority resolution
  • Tests for graceful degradation when server isn't running
  • All existing tests updated to handle new response format

3. Good Documentation

  • Clear JSDoc comments explaining the priority chain
  • Helpful examples in the PR description
  • Code comments explaining the reasoning

4. Backwards Compatibility

  • Default behavior (diffs don't fail tests) maintains existing workflow
  • Graceful handling when Vizzly server isn't running (Percy-style behavior)

🔍 Issues & Suggestions

1. Missing Test: Per-Snapshot failOnDiff Override

Severity: Medium

The Ember SDK supports vizzlySnapshot('name', { failOnDiff: true }) but there's no test verifying this actually works. This is a critical feature that should be tested.

Location: clients/ember/src/test-support/index.js:278

2. Inconsistent Error Handling for Missing diffPercentage

Severity: Low

In clients/ember/src/test-support/index.js:282, you use optional chaining for result.diffPercentage?.toFixed(2), but the server always returns diffPercentage when status === 'diff'. The optional chaining suggests the field might be missing, which could hide bugs.

Location: clients/ember/src/test-support/index.js:282

3. Potential Race Condition in Server Discovery Cache

Severity: Low

The cachedServerInfo in snapshot-server.js is cached globally but only cleared in tests. If the TDD server is restarted with different failOnDiff settings during a long-running test session, the cache won't reflect the new value.

Location: clients/ember/src/launcher/snapshot-server.js:37-40

4. Missing Environment Variable Documentation

Severity: Low

The PR adds VIZZLY_FAIL_ON_DIFF environment variable, but it's not documented in CLAUDE.md (which lists other env vars around line 120).

5. Function Name Could Be Clearer

Severity: Very Low (nitpick)

The function shouldFailOnDiff() in test-support/index.js:142 only checks the environment variable, not the full priority chain. Consider renaming to getEnvFailOnDiff() for clarity.


🎯 Code Quality

Excellent:

  • Clean separation of concerns (server returns data, SDK decides what to do with it)
  • No breaking changes to existing API
  • Consistent naming conventions
  • Proper error messages with actionable guidance

🔒 Security & Performance

No security concerns identified. The changes don't introduce any new attack surfaces or handle sensitive data.

Performance impact: Negligible. Minimal I/O impact and trivial CPU cost for boolean flag checks.


✨ Final Verdict

This is production-ready with minor improvements recommended. The core design is sound, the implementation is clean, and test coverage is good. Address Issue #1 (missing per-snapshot test) before merging, and consider the other suggestions for future polish.

Estimated risk: Low - Changes are well-isolated, backwards-compatible, and thoroughly tested.

Nice work on this feature! The API design (especially the priority chain and the 200-not-422 decision) shows thoughtful engineering.

@Robdel12 Robdel12 merged commit f92aec0 into main Jan 5, 2026
23 of 25 checks passed
@Robdel12 Robdel12 deleted the rd/ember-bug-fixes branch January 5, 2026 08:17
Robdel12 added a commit that referenced this pull request Jan 15, 2026
Fixes #179 - Vitest client was checking for 'passed'/'failed' status
values but PR #169 changed the TDD handler to return 'match'/'diff'.

**Vitest SDK fix:**
- Changed status checks from 'passed' to 'match'
- Changed status checks from 'failed' to 'diff'
- Added handling for 'baseline-updated' status

**CI improvements:**
- Added `test-sdk-e2e` job that always runs (not conditional on path changes)
- Tests Core JS Client, Vitest, Ember, and Ruby SDKs end-to-end
- Removed redundant `test-site` job (now part of test-sdk-e2e)
- Shared Playwright browser cache across SDK tests

This ensures CLI changes that break SDK integrations are caught before merge.
Robdel12 added a commit that referenced this pull request Jan 15, 2026
## Summary

- Fix Vitest SDK to use correct status values from TDD handler
(`match`/`diff` instead of `passed`/`failed`)
- Add `test-sdk-e2e` CI job that always runs to catch CLI changes that
break SDK integrations
- Remove redundant `test-site` job (consolidated into `test-sdk-e2e`)
- Add shared Playwright browser cache across SDK tests for faster CI

Fixes #179

## Changes

**Vitest SDK (`clients/vitest/src/setup.js`):**
- Changed `'passed'` → `'match'` status check
- Changed `'failed'` → `'diff'` status check  
- Added handling for `'baseline-updated'` status

**CI Workflow (`.github/workflows/ci.yml`):**
- Added `test-sdk-e2e` job that runs on every PR/push (not conditional
on SDK file changes)
- Tests Core JS Client, Vitest SDK, Ember SDK, and Ruby SDK end-to-end
- Consolidated `test-site` into `test-sdk-e2e` 
- Shared Playwright browser cache (chromium + firefox) across all SDK
tests

## Context

PR #169 changed the TDD handler to return `'match'`/`'diff'` instead of
`'passed'`/`'failed'`, but the Vitest client wasn't updated. This caused
the error "Unknown comparison status: match" reported in #179.

The SDK E2E tests would have caught this - they now run on every PR to
prevent similar regressions.

See #180 for follow-up work to add Storybook and Static-site SDK E2E
tests.

## Test plan

- [x] CLI lint passes
- [x] CLI tests pass (1784/1784)
- [x] Vitest client lint passes
- [x] Vitest client unit tests pass (12/12)
- [x] CI workflow YAML is valid
- [ ] CI `test-sdk-e2e` job passes (will verify in this PR)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants